Abstract:
In this paper, the author considers an autoregressive process whose parameters are unknown and tries to obtain pivots for predicting future observations. If a probabilistic prediction is made with the estimated model, where the parameters are estimated from a sample of size n, an error of order n^(-1) is introduced in the coverage probabilities of the prediction intervals. However, the order of the error can be reduced if the estimated prediction bounds are adequately calibrated. The solution obtained can be expressed in terms of an approximate predictive pivot.
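As an illustration of the uncalibrated starting point described above, the following is a minimal sketch of a plug-in one-step prediction interval. It assumes a zero-mean Gaussian AR(1) model fitted by least squares; the abstract specifies neither the model order nor the estimator, and the function names are hypothetical. The calibration step that reduces the coverage error is not shown.

```python
import random
from statistics import NormalDist


def simulate_ar1(phi, sigma, n, seed=0):
    """Simulate n observations of x_t = phi * x_{t-1} + e_t, e_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, sigma))
    return x


def fit_ar1(x):
    """Least-squares estimates of (phi, sigma) for a zero-mean AR(1) (assumed model)."""
    num = sum(a * b for a, b in zip(x[:-1], x[1:]))
    den = sum(a * a for a in x[:-1])
    phi_hat = num / den
    resid = [b - phi_hat * a for a, b in zip(x[:-1], x[1:])]
    # One degree of freedom is used up by phi_hat.
    sigma_hat = (sum(r * r for r in resid) / (len(resid) - 1)) ** 0.5
    return phi_hat, sigma_hat


def plugin_interval(x, level=0.95):
    """Plug-in interval for the next observation, treating the estimates as exact."""
    phi_hat, sigma_hat = fit_ar1(x)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    centre = phi_hat * x[-1]
    return centre - z * sigma_hat, centre + z * sigma_hat
```

Because the interval ignores the sampling variability of the estimates, its actual coverage differs from the nominal level, which is the error the abstract proposes to reduce by calibration.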
Abstract:
This paper describes a methodology for providing multiprobability predictions for proteomic mass spectrometry data. The methodology is based on a newly developed machine learning framework called Venn machines. It outputs a valid probability interval. The methodology is designed for mass spectrometry data. For demonstration purposes, we applied it to MALDI-TOF data sets in order to predict the diagnosis of heart disease and the early diagnosis of ovarian cancer and breast cancer. The experiments showed that the probability intervals are narrow, that is, the output of the multiprobability predictor is similar to a single probability distribution. In addition, the probability intervals produced for the heart disease and ovarian cancer data were more accurate than the output of the corresponding probability predictor. When Venn machines were forced to make point predictions, the accuracy of such predictions was, for most of the data, better than the accuracy of the underlying algorithm that outputs a single probability distribution over labels. Application of this methodology to MALDI-TOF data sets empirically demonstrates its validity. The accuracy of the proposed method on ovarian cancer data rises from 66.7% 11 months in advance of the moment of diagnosis to up to 90.2% at the moment of diagnosis. The same approach was applied to heart disease data without time dependency, although the achieved accuracy was not as high (up to 69.9%). The methodology allowed us to confirm mass spectrometry peaks previously identified as carrying statistically significant information for discrimination between controls and cases.
Abstract:
In this study, we investigate probabilistic predictability for the El Niño-Southern Oscillation (ENSO) by assessing both actual prediction skill and potential predictability using a long-term retrospective forecast from a complicated coupled general circulation model (CGCM). Our results indicate that above and below normal events are more predictable than neutral events. The probabilistic prediction skill suffers from a prominent “Spring Predictability Barrier” and undergoes notable interdecadal variation. For the above and below normal events, the lowest probabilistic prediction skills appear during 1920–1940 and the higher prediction skills occur after the 1960s. The seasonal and interdecadal variability of the probabilistic prediction skill stems mainly from the variability of the ENSO signal intensity. There is much room for improvement in the predictability of all three categories of ENSO events. At least an additional 1 or 2 months of skillful probabilistic predictions can be expected in the future. To our knowledge, this is the first study to use a CGCM to evaluate probabilistic predictability for ENSO at various time scales.
Abstract:
This paper discusses the development of a deforestation (D) prediction model using joint conditional probability. Ground truth was determined in Higashi-Shirakawa city, in the Gifu prefecture of Japan. Four related factors, consisting of geographic factors (slope, distance from the road, and distance from the forest and nonforest boundary) and one of three vegetation change detection (VCD) factors (NDVI, bandS, or spectral shape classification (SSC)), were used in direct and Bayes models to predict D. We tested two partitioning approaches, half-portion partitioning and systematic grid partitioning, in constructing the prediction models. In each approach, the study area was partitioned into two groups for training and validation and then reversed to verify the partitioning approach. The results of the half-portion partitioning were inconsistent, primarily because the half-portion partition is very large (about 80% of the D areas were found in one half portion). The systematic grid partition yielded a better result than the half-portion partition. Although the accuracies of the direct and Bayes models were relatively close, the results of the Bayes model were more consistent. Similar prediction models could also be constructed to monitor other activities under the Kyoto Protocol, such as afforestation and reforestation.
Abstract:
Understanding the relationship between probabilistic and deterministic predictabilities is important for climate predictability studies. Focusing on the actual skill of dynamical seasonal prediction, we previously found that the probabilistic skills of resolution and relative operating characteristic (ROC)/discrimination, but not reliability, have functional relationships with deterministic anomaly correlation (AC). Herein, we further investigate the relationship between probabilistic and deterministic seasonal potential predictabilities. The potential predictabilities are characterized by the potential skills of the AC, resolution, and ROC evaluated using the perfect-model framework, under which reliability is ideal and not considered. A theoretical argument demonstrates that similar theoretical relationships to those for actual skills exist between probabilistic and deterministic potential predictabilities, regardless of how different the potential predictabilities are from the corresponding actual skills. These theoretical relationships are strictly monotonic and characterized by symmetrical probabilistic predictabilities for the below- and above-normal categories, and lower predictability for the near-normal category corresponding to deterministic predictability. A subsequent diagnostic analysis reveals that while the probabilistic and deterministic potential predictabilities in current dynamical climate models differ noticeably from the corresponding actual skills, they exhibit quasi-monotonic relationships as expected theoretically, which effectively and quantitatively validates the theoretical argument. This work, combined with our previous findings, establishes a solid equivalence of the resolution and discrimination aspects of probabilistic predictability to deterministic predictability in seasonal prediction, which can have beneficial implications for further studying probabilistic predictability.
Abstract:
Today's drivers of battery electric vehicles must deal with limited driving range in a sparse charging infrastructure. An accurate prediction of energy demand and driving range is therefore important and enables reliable routing and charge planning applications. Predictions of energy demand entail uncertainty, which can be considered directly with the use of probabilistic prediction algorithms. Machine learning algorithms are frequently applied in this context, but the data used to train these algorithms are often distributed over a fleet of connected vehicles. Federated learning can be applied in this setting, but predictive uncertainty is typically not considered. We apply an extension of the federated averaging algorithm to learn probabilistic neural networks and linear regression models in a communication-efficient and privacy-preserving manner. We demonstrate the performance advantage of probabilistic prediction models over deterministic prediction models using proper scoring rules. Furthermore, we show that federated learning can improve on standard driver-individual learning. Using probabilistic predictions, variable safety margins based on destination attainability can be applied, leading to increased effective driving range and reduced travel time.
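For orientation, the aggregation step of plain federated averaging can be sketched as below. This is only the standard size-weighted parameter average, not the probabilistic extension the abstract refers to (which is not described here); the function name and the flat-list parameter layout are assumptions made for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Size-weighted average of client parameter vectors (plain federated averaging).

    client_weights: list of equal-length lists of floats, one per client (assumed layout).
    client_sizes:   number of local training samples held by each client.
    """
    if len(client_weights) != len(client_sizes):
        raise ValueError("one size is required per client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]
```

Only parameter vectors and sample counts leave each vehicle in this scheme, which is the sense in which such training avoids sharing raw driving data.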
Abstract:
A new probabilistic forecast product, the Probability of RETurn (PRET), is introduced. PRET, the probability of occurrence of an event that corresponds to a specific return period, is computed from forecasts given by the ECMWF Ensemble Prediction System. It has been designed to provide easy-to-interpret and valuable information on the intensity and rarity of the expected severe weather, especially when the ensemble-based forecast distribution falls outside the model climate distribution. The PRET definition relies on the Generalized Extreme Value family of distributions, which has been applied to study the statistics of the extremes in the model forecasts and observed datasets, and to estimate the levels corresponding to return periods not included in the datasets. PRET forecasts for the 2-metre maximum and minimum temperatures over Europe have been generated for six summer and six winter seasons (2003 to 2009). Case studies have been used to illustrate that the new product is easier to interpret than products that are now commonly used, such as probability forecasts and maps of Extreme Forecast Indices. Average diagnostics of PRET forecasts indicate that the skill in predicting extremely hot temperatures in the warm season is higher than the skill in predicting extremely cold temperatures in the cold season.
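The link between a return period and a Generalized Extreme Value fit can be sketched as follows. The return-level formula is the standard GEV quantile; estimating the event probability as the fraction of ensemble members beyond that level is an assumption made here for illustration, since the abstract does not give the exact PRET computation. Function and parameter names are hypothetical.

```python
import math


def gev_return_level(mu, sigma, xi, period):
    """Level exceeded on average once every `period` blocks under GEV(mu, sigma, xi)."""
    if period <= 1:
        raise ValueError("period must exceed 1")
    y = -math.log(1.0 - 1.0 / period)
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV family (shape parameter xi -> 0).
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


def exceedance_fraction(members, level):
    """Share of ensemble members above `level` (assumed stand-in for the event probability)."""
    return sum(1 for m in members if m > level) / len(members)
```

For minimum temperatures the same idea would be applied to the lower tail, for example by fitting the distribution to negated values.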
Abstract:
In this paper, a computational technique to deal with uncertainty in dynamic continuous models in the social sciences is presented. Considering data from surveys, the method consists of determining the probability distribution of the survey output, which allows data to be sampled and the model to be fitted to the sampled data using a goodness-of-fit criterion based on the chi-squared test. Taking the fitted parameters that were not rejected by the chi-squared test, substituting them into the model and computing its outputs, 95% confidence intervals at each time instant, capturing the uncertainty of the survey data (probabilistic estimation), are built. Using the same set of obtained model parameters, a prediction over the next few years with 95% confidence intervals (probabilistic prediction) is also provided. This technique is applied to a dynamic social model describing the evolution of the attitude of the Basque Country population towards the revolutionary organisation ETA. (C) 2015 Elsevier Inc. All rights reserved.
Abstract:
In recent years, global warming mitigation has received increasing attention. Reasonable carbon prices in stable carbon markets can reduce greenhouse gas emissions. Therefore, to accurately predict carbon prices and describe their fluctuations, a hybrid model was developed based on Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN), Sample Entropy (SE), Long Short-Term Memory (LSTM), Quantile Regression (QR), and Kernel Density Estimation (KDE). First, CEEMDAN was used to decompose the original carbon price series into several Intrinsic Mode Functions (IMFs), and the SE of each decomposed series was calculated to reconstruct a new series. The actual influencing factors determine the number of new series. The prediction model, named QRLSTM, combines QR and LSTM to achieve point and interval predictions. Finally, KDE was used to obtain a probabilistic prediction of the daily carbon price. Compared with other methods, the experimental results in Hubei, China, showed that the proposed CEEMDAN-SE-QRLSTM model had the best performance. For point prediction, the MSE, MAE, RMSE, MAPE, and R² were 0.19, 0.33, 0.43, 0.01, and 91.7%, respectively. In interval prediction, the coverage width criterion (CWC) in the 95%, 90%, and 80% confidence intervals was small, with values of 0.37, 0.32, and 0.27, respectively. In probabilistic prediction, the continuous ranked probability score (CRPS) with 95% confidence was small, with a value of 0.25. In addition, the improvement in point prediction and stability of the hybrid model were also demonstrated. The proposed model can provide more accurate results and valuable information for decision-makers.
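Two of the quantities named above can be sketched in a few lines. The pinball loss is the usual training objective for quantile regression, and the sample-based CRPS estimate uses the identity CRPS = E|X - y| - 0.5 E|X - X'|. Both are standard textbook definitions assumed here; the abstract does not state the exact loss or CRPS estimator used, and the function names are hypothetical.

```python
def pinball_loss(y_true, y_pred, tau):
    """Quantile (pinball) loss for a predicted tau-quantile, with 0 < tau < 1."""
    diff = y_true - y_pred
    return max(tau * diff, (tau - 1.0) * diff)


def crps_from_samples(samples, y):
    """Sample-based CRPS estimate: mean|X - y| - 0.5 * mean|X - X'|."""
    n = len(samples)
    term1 = sum(abs(s - y) for s in samples) / n
    term2 = sum(abs(a - b) for a in samples for b in samples) / (2.0 * n * n)
    return term1 - term2
```

With a high quantile level, under-prediction is penalised more heavily than over-prediction, which is what pushes the fitted value towards the upper edge of a prediction interval. Lower CRPS values indicate a sharper and better calibrated predictive distribution.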
Abstract:
We propose two Bayesian multinomial-Dirichlet models to predict the final outcome of football (soccer) matches and compare them to three well-known models regarding their predictive power. All the models predicted the full-time results of 1710 matches of the first division of the Brazilian football championship and the comparison used three proper scoring rules, the proportion of errors and a calibration assessment. We also provide a goodness of fit measure. Our results show that multinomial-Dirichlet models are not only competitive with standard approaches, but they are also well calibrated and present reasonable goodness of fit.
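The conjugate update underlying any multinomial-Dirichlet model, together with one proper scoring rule, can be sketched as below. This is the basic textbook form only: how the authors build the prior and the counts for each match is not given in the abstract, the Brier score is chosen here as an example of a proper scoring rule, and the function names and the (home win, draw, away win) ordering are assumptions.

```python
def posterior_predictive(prior, counts):
    """Predictive outcome probabilities under a Dirichlet prior and multinomial counts."""
    total = sum(prior) + sum(counts)
    return [(a + c) / total for a, c in zip(prior, counts)]


def brier_score(probs, outcome_index):
    """Multi-category Brier score of one forecast; lower is better."""
    return sum(
        (p - (1.0 if k == outcome_index else 0.0)) ** 2
        for k, p in enumerate(probs)
    )


# Hypothetical example: uniform prior, 5 home wins, 3 draws, 2 away wins observed.
probs = posterior_predictive([1.0, 1.0, 1.0], [5, 3, 2])
```

Averaging such scores over all matches gives a single number by which competing models can be compared.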